✂️ Semantic Segmentatition

Introduction

Instead of predicting one label (cat, dog, etc.) per image, we will predict one label per pixel!

Each pixel should belong to a class (cat, dog, etc.) or to a background class.

Applications

Autonomous driving	Medicine

Representing the task

Similar to how we treat standard categorical values, we’ll create our target by one-hot encoding the class labels - essentially creating an output channel for each of the possible classes.

Models

Note that the model backbone can be a resnet, densenet, inception…

Naive model: Convolutions + Transpose Convolutions (stride=2)

Better model: Convs + TransposeConvs(stride=2) + Residual connections = UNET

History of the styate of the art

Name	Description	Date	Instances
FCN	Fully Convolutional Network	2014
SegNet	Encoder-decorder	2015
Unet	Concatenate like a densenet	2015
DeepLab	Atrous Convolution and CRF	2016
ENet	Real-time video segmentation	2016
PSPNet	Pyramid Scene Parsing Net	2016
FPN	Feature Pyramid Networks slides	2016	Yes
DeepLabv3	Increasing dilatation & field-of-view	2017
LinkNet	Adds like a resnet	2017
DeepLabv3+		2018
PANet	Path Aggregation Network	2018	Yes
Panop FPN	Panoptic Feature Pyramid Networks	2019	?
PointRend	Image Segmentation as Rendering	2019	?

Post-processing (OPTIONAL)

Conditional Random Fields (CRF)
Grabcut
- Deep cat

Metric ands losses

Pixel-wise cross entropy
IoU (F0): (Pred ∩ GT)/(Pred ∪ GT) = TP / TP + FP * FN
Dice (F1): 2 * (Pred ∩ GT)/(Pred + GT) = 2·TP / 2·TP + FP * FN
- Range from 0 (worst) to 1 (best)
- In order to formulate a loss function which can be minimized, we’ll simply use 1 − Dice

Pixel-wise cross entropy

Dice loss

Notebook: CAMVID dataset

Reference

Blog: An overview of semantic image segmentation
Image Segmentation Using Deep Learning: A Survey Nov 2020
https://www.jeremyjordan.me/semantic-segmentation
https://www.jeremyjordan.me/evaluating-image-segmentation-models
Check Res2Net
Check catalyst segmentation tutorial (Ranger opt, albumentations, …)
this repo